Modification of the structural stability of human serum albumin in rheumatoid arthritis

Differential scanning calorimetry (DSC) can indicate changes in structure and/or concentration of the most abundant proteins in a biological sample via heat denaturation curves (HDCs). In blood serum, for example, HDC changes result from either concentration changes or altered thermal stabilities for 7–10 proteins, and HDCs have previously been shown capable of differentiating between sick and healthy human subjects. Here, we compare HDCs and proteomic profiles of 50 patients experiencing joint-inflammatory symptoms, 27 of whom were clinically diagnosed with rheumatoid arthritis (RA). The HDCs of all 50 subjects appeared significantly different from expected healthy curves, but comparison of additional differences between the RA and the non-RA subjects allowed a more specific understanding of RA samples. We used mass spectrometry (MS) to investigate the reasons behind the additional HDC changes observed in RA patients. The HDC differences do not appear to be directly related to differences in the concentrations of abundant serum proteins. Rather, the differences can be attributed to modified thermal stability of some fraction of the human serum albumin (HSA) proteins in the sample. By quantifying differences in the frequency of artificially induced post-translational modifications (PTMs), we found that HSA in RA subjects had a much lower surface accessibility, indicating potential ligand or protein binding partners in certain regions that could explain the shift in HSA melting temperature in the RA HDCs. Several low abundance proteins were found to have significant changes in concentration in RA subjects and could be involved in or related to binding of HSA. Certain amino acid site clusters were found to be less accessible in RA subjects, suggesting changes in HSA structure that may be related to changes in protein-protein interactions. These results all support a change in behavior of HSA which may give insight into mechanisms of RA pathology.


Introduction
Rheumatoid arthritis (RA) is a systemic inflammatory autoimmune disease characterized by non-articular changes, symmetrical polyarthritis, and congenital symptoms [1,2] proteins (~8 proteins) in the serum [32]. In this study, HDCs were used to characterize the altered serum proteins in RA patients (clinically diagnosed according to symptoms and various biomarker levels). A characteristic HDC shift was seen across samples, and there was a significant relationship between RA diagnosis and HDC appearance. HDCs from healthy subjects typically contain two peaks, at around 62.8 and 69.8˚C [22]. We saw a distinct decrease in the intensity of the lower temperature peak in RA samples. Mass spectrometry (MS) was then used to identify proteomic differences in the RA vs. non-RA samples, and we also looked for correlations between the proteome and HDC appearance (grouping samples by peak ratio, independent of RA diagnosis). Together, these results allowed us to understand which changes in the RA proteome could be attributed to the observed HDC differences.
After noticing the distinct differences in HDC pattern across samples, we considered two possible mechanisms to explain this HDC shift (Fig 2). First, changes in concentration of abundant proteins, such as human serum albumin (HSA), would alter the intensity of high abundance protein peaks, changing overall HDC shape [22,33] (Fig 2A). In this study, we focus primarily on HSA because it is the most abundant protein in plasma. Second, HDC shape would be significantly altered by a change in thermal stability of abundant proteins, shifting their melting temperatures. For example, loading HSA with a fatty acid (octanoic acid) increases the melting temperature by 5 to 10˚C [34]. Such changes in stability would most likely correlate with a change in tertiary structures of a smaller fraction of the total HSA [35] (Fig 2B). We used MS to explore both concentration differences in HSA (and other abundant proteins), as well as HSA tertiary structure changes (by looking at surface reactivity [36,37]) as potential causes of the characteristic shift in the HDCs. As outlined below, our data support a change in the thermal stability of a portion of the HSA (Fig 2B). This change in thermal stability, along with MS-detected differences, provides clues for HSA structural changes [38,39].

Fig 1. Blood serum samples from subjects that had physician-ordered RA panels were used (n = 50). Based on clinical diagnosis from medical professionals, the samples were separated into an RA group (n = 27) and a non-RA group (n = 23). DSC was used to obtain the HDCs, and a characteristic shift was seen between groups. LC-MS/MS experiments were performed to determine the mechanism behind this shift. Quantification and surface amino acid reactivity analyses were performed to determine significant differences in serum proteins between RA and non-RA groups. https://doi.org/10.1371/journal.pone.0271008.g001

Results and discussion
We used a sample population of fifty anonymized serum samples from patients who experienced joint inflammatory symptoms. Since the samples we used were anonymized post-analysis, and the data has no connection to the subjects, no IRB approval was needed. Patients ranged from age 12 to 88, with a median age of 50. Samples were not selected based on gender and the resulting sample set contained males (n = 13) and females (n = 37), matching statistical prevalence of RA [40]. The rheumatoid arthritis panel [41] was conducted for the serum samples by ARUP Laboratories, including a rheumatoid factor (RF) and cyclic citrullinated peptide (CCP) test. Professional medical analysis of symptoms, paired with the CCP and RF levels, classified the 50 samples as coming from RA (n = 27) and non-RA (n = 23) subjects (S1 Data). Note that the serology results show only some of the factors used for RA clinical diagnosis. Other factors (joint involvement, acute phase reactants, and symptoms duration, etc. [42]) were used for diagnosis, but the supplementary diagnostic information was not provided for this study.

Heat denaturation curves
Heat denaturation curves (HDCs) were collected with a NanoDSC (TA Instruments, Lindon, UT). Forty-seven HDCs were obtained (HDCs for three subjects were uninterpretable due to errors during the sample injection).

Fig 2. The decreased intensity of the lower temperature peak in RA samples could be explained by A) a relative HSA concentration decrease, reducing HDC signal intensity, or B) a shift in thermal stability for a portion of the HSA population, shifting the HSA peak to the right, decreasing the first peak's intensity, and increasing that of the second peak. This thermal stability could be the result of altered binding partners. The structural changes can be seen through biased detection of surface modifications on HSA. When cargo is unbound (top right, light green protein), binding sites are surface accessible for modification; when cargo is bound (top right, dark green protein), these sites are occupied, reducing surface accessibility and the ability of these sites to be modified. These structural and functional differences at the molecular level could be the explanation for the observed shift in HDC. https://doi.org/10.1371/journal.pone.0271008.g002

As seen in literature, HDCs for healthy subjects have two distinct peaks around 63˚C and 71˚C, which correspond to the known melting points for HSA and immunoglobulin proteins, respectively [30]. Previous studies showed the low temperature peak at 63˚C is primarily a combination of HSA and haptoglobin (HAPT), in which HSA dominates due to its much higher concentration [30]. The high temperature peak at 71˚C is primarily a combination of immunoglobulin G (IgG) and immunoglobulin A (IgA) underlain by the tail of the broad HSA peak [30]. As shown in Fig 3A, the non-RA HDCs show a reduced peak ratio (1.00 ± 0.23), and the RA HDCs show an even more substantial decrease in peak ratio (0.83 ± 0.16). Healthy normal serum samples are reported in the literature with a higher HSA (low temp) peak at 63˚C and a comparatively lower Ig (high temp) peak at 71˚C [30]. A separate study from Garbett et al. shows that the 63˚C to 71˚C peak ratio in a cohort of healthy samples has a much greater intensity for the low temperature peak (~1.5), reaffirming that none of the current samples can be described as healthy [22]. A two-tailed t-test yields a p-value of 0.007 between RA and non-RA subjects, indicating that the HDC peak ratios are statistically different (Fig 3A, S1 and S2 Figs in S1 File, S2 Data). This pattern is consistent with literature and can be seen in other autoimmune disorders such as lupus [23,24]. Similar to literature [25,26], when the HDCs were ranked according to peak ratio (regardless of RA diagnosis), they could be separated into two groups that correlated with the RA diagnosis: low peak ratio (LPR, peak ratio < 1.00, n = 32), and high peak ratio (HPR, peak ratio > 1.00, n = 15) (Fig 3B). Associating the HPR group with non-RA and the LPR group with RA gives a point-biserial correlation coefficient of 0.3966 (r² ≈ 0.157), meaning that about 15.7% of the variability in peak ratio can be attributed to the RA diagnosis.
With this association, a threshold ratio of 1.00 splits the samples (for the 47 HDCs obtained) with the smallest misclassification rate (27.7%). Using this threshold, 22 of the 32 samples (68.8%) in the LPR group are classified as RA, while 12 of the 15 samples (80.0%) in the HPR group are classified as non-RA. Conversely, 22 of the 27 RA samples (81.5%) are in the LPR group, and 12 of the 22 non-RA samples (54.5%) are in the HPR group (S1 Data). These classification rates are likely impacted by the imperfect specificity and sensitivity of RA diagnosis mentioned earlier, as well as the presence of comorbidities in RA and non-RA subjects.
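The group percentages quoted above follow from four counts given in the text. The short sketch below reproduces that arithmetic; it assumes only those counts, and the variable names are illustrative rather than taken from the original analysis.

```python
# Counts quoted in the text for the 47 interpretable HDCs, split at a
# peak-ratio threshold of 1.00. Variable names are illustrative only.
lpr_ra = 22       # low peak ratio, diagnosed RA
lpr_non_ra = 10   # low peak ratio, diagnosed non-RA (32 LPR samples in total)
hpr_non_ra = 12   # high peak ratio, diagnosed non-RA
hpr_ra = 3        # high peak ratio, diagnosed RA (15 HPR samples in total)

total = lpr_ra + lpr_non_ra + hpr_non_ra + hpr_ra

# Treating LPR as "predicted RA" and HPR as "predicted non-RA",
# the off-diagonal cells are the misclassified samples.
misclassified = lpr_non_ra + hpr_ra
misclassification_rate = misclassified / total

lpr_fraction_ra = lpr_ra / (lpr_ra + lpr_non_ra)
hpr_fraction_non_ra = hpr_non_ra / (hpr_non_ra + hpr_ra)

print(f"misclassification rate: {misclassification_rate:.1%}")
print(f"RA fraction of LPR group: {lpr_fraction_ra:.1%}")
print(f"non-RA fraction of HPR group: {hpr_fraction_non_ra:.1%}")
```

Writing the split as a 2 × 2 table makes it easy to see that most of the misclassified samples are non-RA subjects with a low peak ratio.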
Several of the non-RA samples are categorized in the LPR group, and this could be the result of other diseases or physiological differences [30] that alter HSA and other serum proteins; examples include Lyme disease, lupus, and diabetes [22,43]. This seems likely given that all 50 subjects originally came in for testing because they were experiencing symptoms of discomfort and sickness. We are interested in mechanisms behind these HDC shifts (Fig 2), so we used MS to evaluate the differences between both the RA/non-RA subjects and the HPR/LPR groups.

Proteomics
Protein concentrations. The 50 serum samples were individually digested to tryptic peptides and analyzed using mass spectrometry to further explore the difference in protein content between RA and non-RA serum samples. Relative protein quantification analysis (PEAKS Studio 8.5, Bioinformatics Solutions Inc. [44], S3 Data) shows there are no significant differences in protein concentration between RA and non-RA groups or HPR and LPR groups for any of the top eight most abundant proteins (significant changes are defined as proteins with a fold change less than 0.5 or greater than 2 and a p-value less than 0.05) (Fig 4, S1 Table in S1 File). The concentration fold change for each protein in each comparison was calculated by evaluating the ratio of RA abundance to non-RA abundance and LPR abundance to HPR abundance. The RA, non-RA, LPR, and HPR protein abundances were defined as the average MS intensity across the samples of each respective group.

Fig 3. This study focuses on the two peaks observed between 55 and 75˚C of the heat denaturation curve. (A) The average normalized HDC curve for non-RA and RA samples, with the difference between the two shown in black. The first peak from HSA is consistently found around 63˚C (low temp peak) and the second Ig peak is always around 71˚C (high temp peak). Inset for A shows the distribution of peak ratios from the HDC of RA and non-RA subjects. The difference in peak ratio between the non-RA and RA groups is statistically significant (p = 0.007). (B) The distribution of peak ratio of all samples, with a peak ratio threshold of 1.00 as the cut-off between the HPR and LPR groups. https://doi.org/10.1371/journal.pone.0271008.g003

It is expected that specific autoantibody concentrations would increase in patients with RA [45][46][47], but since the RA antigen-specific Ig population is a relatively small percentage of the entire Ig population, and significant sequence homology exists between immunoglobulins, it is difficult to distinguish target-specific antibodies using MS only. Also, the comparison was not against "healthy" controls, so the lack of significance in Ig could be because an Ig increase, non-specific to RA, may have occurred across many of the samples, elevating Ig levels altogether. These results suggest that a change in concentration of abundant serum proteins does not contribute to the decreased HDC peak ratio observed in RA samples.
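The significance rule stated above (fold change below 0.5 or above 2, with p < 0.05) can be written as a short check. This is a minimal sketch, not the PEAKS Studio workflow: the function name, the example intensities, and the example p-values are hypothetical, and the p-value is assumed to come from a separate statistical test.

```python
import math

def is_significant(group_a_mean, group_b_mean, p_value,
                   fold_cutoff=2.0, p_cutoff=0.05):
    """Apply the stated rule: fold change < 0.5 or > 2, and p < 0.05.

    group_a_mean / group_b_mean are average MS intensities per group
    (e.g. RA vs. non-RA, or LPR vs. HPR).
    """
    fold_change = group_a_mean / group_b_mean
    log2_fc = math.log2(fold_change)
    # |log2(FC)| > 1 is equivalent to FC > 2 or FC < 0.5.
    passes_fold = abs(log2_fc) > math.log2(fold_cutoff)
    return fold_change, log2_fc, passes_fold and p_value < p_cutoff

# Hypothetical intensities and p-values, for illustration only.
examples = {
    "protein_A": (4.0e6, 1.5e6, 0.01),
    "protein_B": (2.0e6, 1.9e6, 0.40),
}
for name, (group_a, group_b, p) in examples.items():
    fc, lfc, sig = is_significant(group_a, group_b, p)
    print(name, round(fc, 2), round(lfc, 2), sig)
```

Working in log2 units keeps up- and down-regulation symmetric around zero, which is also how the volcano plots in Fig 4 are drawn.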
Aside from the top eight proteins, overall proteomic analysis showed that among the 421 proteins compared, a statistically significant change was only seen in five proteins when comparing RA to non-RA samples (Fig 4A) and 14 proteins when comparing the HPR and LPR groups (Fig 4B). The only common significant protein between these two groups was C-reactive protein (CRP), a protein known to be associated with systemic inflammation. The significance of CRP in both the RA vs. non-RA and HPR vs. LPR comparisons indicates that CRP may be involved in mechanisms accounting for the HDC shift seen in RA samples. It is important to note that CRP concentration is most likely upregulated for all subjects relative to healthy controls, as has been described previously, but it is significantly lower in both LPR and RA groups. Vitamin D binding protein (VDBP, known to be related to RA [11]) was significantly downregulated in RA samples, and had no significant changes between HPR and LPR groups. These results suggest that although VDBP and other proteins may be associated with RA, their relatively low concentration means they are not directly affecting the HDC shift in RA samples. However, the change in the concentration of these proteins may affect our

Fig 4. … and the statistically significant proteins (log2(fold change) < -1 or > 1, p-value < 0.05) are indicated in red. C-reactive protein is the only significant protein in both plots. The fold change is calculated in each comparison by dividing RA abundance by non-RA abundance and LPR abundance by HPR abundance for all proteins that were detected in all samples.
Structural changes in high abundance proteins. Protein concentration does not directly account for the difference in HDCs between RA and non-RA samples, and there is also no link between concentration and the HPR and LPR groups. Therefore, we expect, similar to other diseases explored in literature, that the observed HDC shifts among RA patients and the LPR group are caused by changes in thermal stability for one of the most abundant serum proteins. We simulated shifts in the abundance and/or melting temperatures of various percentages of each of the top eight serum proteins (using individually measured HDCs of these abundant proteins from literature [22]) to recapitulate the observed changes. This simulation was simply a hypothesis-generating technique and tested what the resulting HDC would look like after altering the abundance and/or melting temperature of each of the top eight proteins in the non-RA HDCs. We tested an abundance of 25, 50, 75, 150, 200, and 500% and/or a shift in the melting temperature of -15, -10, -5, 5, 10, and 15˚C. We also performed each of these simulations on various percentages of the total protein present (5, 10, 20, 50, 95, 100%). By comparing the simulated HDC shift to the difference curve shown in Fig 3A, and by visually analyzing the similarity of the shifted curve and the RA curve (S3 Fig in S1 File), we found the most plausible explanation for the shift to be an increase in the melting temperature for a small fraction (~10%) of the HSA pool by about 5-15˚C. Changes in HSA melting temperature could result from new ligand binding, protein interactors, or changes in tertiary structure [48][49][50]. Therefore, we tested for structural changes of HSA through analysis of covalently modified amino acid profiles between the RA and non-RA samples. Both biological and artificially induced modifications were considered.
Changes in biological modifications could show altered RA biochemistry, and changes in artificially induced modifications would show variations in surface accessibility of certain regions of a protein. If RA-specific protein conformation changes are responsible for changes in the HDCs, we also expect the differences in amino acid modification (AAmod) profiles between RA and non-RA groups to be correlated with the observed HDC groups (HPR and LPR).
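To illustrate the kind of hypothesis-generating simulation described above, the sketch below moves a fraction of the HSA pool to a higher melting temperature and recomputes the peak ratio. It is not the authors' procedure: the study used individually measured protein HDCs from the literature, whereas this example substitutes simple Gaussian peaks, and every peak area and width is a hypothetical placeholder.

```python
import math

def gaussian(t, center, width, area):
    """Gaussian stand-in for a single protein's denaturation peak."""
    return area / (width * math.sqrt(2 * math.pi)) * math.exp(
        -0.5 * ((t - center) / width) ** 2)

def simulate_hdc(temps, hsa_tm=63.0, ig_tm=71.0, shifted_fraction=0.0,
                 tm_shift=0.0, hsa_area=1.0, ig_area=0.6, width=3.0):
    """Sum an HSA peak and an Ig peak; move part of the HSA pool by tm_shift."""
    curve = []
    for t in temps:
        native = gaussian(t, hsa_tm, width, hsa_area * (1 - shifted_fraction))
        shifted = gaussian(t, hsa_tm + tm_shift, width,
                           hsa_area * shifted_fraction)
        ig = gaussian(t, ig_tm, width, ig_area)
        curve.append(native + shifted + ig)
    return curve

def peak_ratio(temps, curve, low=63.0, high=71.0):
    """Ratio of the signal at the low- and high-temperature peak positions."""
    return curve[temps.index(low)] / curve[temps.index(high)]

temps = [55.0 + 0.5 * i for i in range(41)]  # 55 to 75 C in 0.5 C steps
baseline = simulate_hdc(temps)
# Hypothetical scenario: 10% of the HSA pool melts 8 C higher.
shifted = simulate_hdc(temps, shifted_fraction=0.10, tm_shift=8.0)
print(round(peak_ratio(temps, baseline), 3),
      round(peak_ratio(temps, shifted), 3))
```

In this toy model, stabilizing a small fraction of HSA lowers the first peak and raises the second, so the peak ratio drops without any change in total HSA concentration.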
Protein Prospector (UCSF) and PEAKS Studio (Bioinformatics Solutions Inc) were both used for contrasting analysis of the PTM data. Multiple peptide modifications were observed as noncanonical m/z shifts with Protein Prospector, including a modification of +183 m/z, which was the most frequently observed modification (41 peptides) on HSA (S4 Data). PEAKS Studio's analysis of HSA proteins and each amino acid modification (AAmod) confirmed the +183 m/z modification as an aminoethylbenzenesulfonyl fluoride (AEBSF) modification, which came from the protease inhibitor cocktail added before processing the serum. Thus, AEBSF was an artificially induced, non-biological PTM. HSA had 185 modified sites that were observed in more than 12 of the samples. Of the 185 total AAmod sites on HSA, there were 33 observed modification types, with the top ten most frequent being AEBSF, 42; Dehydration, 28; Hexose, 17; Deamidation, 14; Iodination, 14; Oxidation, 9; Citrulline, 8; Formylation, 5; Amidation, 5; and Di-iodination, 4. Of these AAmod sites, 71% are specific for only one type of modification (S4 Data). AEBSF was the only AAmod that showed statistically significant differences between RA and non-RA groups (S5 Data). Since the AEBSF modification was synthetically introduced, it is not causing the change in HSA structure but is reporting the fact that the in vivo structure was changed for these reactive sites. Non-RA subjects have, on average, 1.9 times more AEBSF modifications than RA subjects (p = 0.023). The significantly lower number of AEBSF modifications in RA subjects suggests that AAmod sites are less accessible in RA HSA, pointing to conformational changes or a potential increase in binding partners in RA HSA.
AEBSF as a probe of surface reactivity. AEBSF is an irreversible serine protease inhibitor which can react with surface accessible nucleophilic amino acids such as serine (S), lysine (K), tyrosine (Y), histidine (H), and the amino-terminus (Fig 5A) [51,52]. As with other good surface modifiers (diethyl pyrocarbonate [37] or diazonium salt [36]), the prevalence of AEBSF modification can be used to identify changes in the surface accessibility of individual amino acids on proteins between samples. The AEBSF modifications on HSA were most frequently observed on lysine (28 different residues), tyrosine (9 different residues), serine (2 different residues), and histidine (2 different residues) (S5 Data).
To visualize patterns in AEBSF modification between samples and across AAmod sites, we used PNNL Inferno [53] to generate a hierarchical grouped heatmap from the patient-specific ion intensities for each modification site (Fig 5B). From our MS data, samples were sorted into clusters with serum samples on the horizontal axis and HSA modification sites on the vertical axis. The hierarchical order separated the samples into 4 groups. Of the 4 groups, two groups have higher signal intensity (H1 & H2), and two groups have lower signal intensity (L1 & L2). H1 has higher signal intensity at sites shown in clusters C1 and C2, H2 has higher signal intensity at the specific sites in cluster C3, and groups L1 and L2 have lower signal intensity across all sites. While various regions of the heatmap could raise interest, the clusters C1, C2, and C3 were selected for further analysis due to their especially high signal. Higher signal intensity indicates a greater level of AEBSF modification, which implies a greater degree of surface accessibility.

Fig 5. (B) The heatmap generated with PNNL Inferno showing the intensity differences of AEBSF modification at different HSA sites between different samples. The AEBSF modification amino acid number for HSA is listed on the y-axis, and the serum sample number on the x-axis. The samples are separated into 4 groups according to the hierarchy branch of serum samples, from left to right (S5 Data). Group L1 and L2's AEBSF modifications are less intense than those of groups H1 and H2 (L stands for lower intensity and H stands for higher intensity). The three clusters, C1 (green), C2 (purple), and C3 (black), are the most intense AEBSF modification clusters and are examined to characterize the modification further. (C) The bar graph shows the number of RA/non-RA and HPR/LPR samples expected in each AEBSF modification group (L1, H1, L2, H2). The percentage of RA samples in L1, H1, L2, and H2 is 73%, 25%, 69%, and 31%, respectively. For LPR, it is 67%, 89%, 73%, and 50%, respectively. https://doi.org/10.1371/journal.pone.0271008.g005

The four groups (L1, H1, L2, and H2) comprise 32.6%, 17.4%, 23.9%, and 26.0% of the serum samples, respectively. Therefore, in Fig 5C, we can visualize our data against the null hypothesis that the same percentage of RA and non-RA samples should be present in each group. Also, given the proportion of HPR samples within the non-RA group and the proportion of LPR samples within the RA group, we can visualize the null hypothesis of how random assignment would distribute the HPR and LPR samples into each subset of the 4 groups (given the size of the L1, H1, L2, and H2 subset groups) (null hypothesis, H0, Fig 5C). As shown in Fig 5C, we found that a much greater proportion of the RA samples were found in the L1 and L2 groups (42% and 35%, respectively) compared to the non-RA samples (17% and 17%, respectively). A lower percentage of RA samples were in the H1 and H2 groups (8% and 15%, respectively) compared to the non-RA samples (26% and 39%, respectively). The L1 and L2 groups contained close to the expected proportion of samples from the HPR and LPR groups, but in the H2 group (containing a high proportion of non-RA samples), we saw a higher-than-expected percentage of HPR samples (40% of HPR samples were in the H2 group, compared to 19% of LPR samples). However, the H1 group (also containing a high proportion of non-RA samples) had a higher-than-expected percentage of LPR samples (23% of LPR samples compared to 7% of HPR samples). This suggests that the high AEBSF frequency at the AAmod sites in clusters C1 and C2, which are in the H1 region, is connected to a decrease in HDC peak ratio but an RA-negative diagnosis. In fact, more intense AEBSF modifications at these C1 and C2 sites may give insight into why certain non-RA samples exhibited a low HDC peak ratio (increased surface accessibility from other factors not specific to RA).
On the other hand, the high AEBSF frequency at the modification sites in cluster C3 is connected to both a higher HDC peak ratio and non-RA subjects (Fig 5B and 5C), indicating that decreased accessibility of the C3 amino acid binding sites seen in RA samples may be directly linked to the observed HDC shift seen in RA samples.
Together, this pattern suggests that HSA in the RA/LPR groups may have binding partners or other ligand interactors that block those C3 sites. Additionally, the association between HPR and RA samples in the H2 group suggests that the decreased accessibility of C1/C2 AAmod sites, likely due to binding partners or other conformational changes, is unlikely to be the cause of the increased HDC shifts observed in RA HSA. These binding partners could be related to other diseases that the RA-negative (yet still symptomatic) patients were experiencing when they came in to be tested for RA.
It should be noted that the clustering in our heatmap in Fig 5B is data-driven, using these 50 subjects as a training set. Therefore, statistical inference, error bars, and p-values are not appropriate as we analyze how the data in Fig 5C deviate from our null hypothesis. To test the hypothesis that these patterns can be applied to a population with statistical confidence, additional groups of non-RA and RA subjects would need to be collected and compared to our clustered model. Therefore, we compared these observations against previously published literature to gain valuable insight into the connection between these AEBSF groups and the modification sites in relation to specific changes in HSA tertiary structure.
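The comparison against the null hypothesis can be made concrete with the percentages quoted in the text. The sketch below is purely descriptive and, in keeping with the caveat above, performs no statistical test; the dictionary and function names are illustrative and not taken from the original analysis.

```python
# Percentages quoted in the text (Fig 5C, S5 Data), written as fractions.
groups = ["L1", "H1", "L2", "H2"]
group_share_of_all_samples = {"L1": 0.326, "H1": 0.174, "L2": 0.239, "H2": 0.260}
share_of_ra_samples = {"L1": 0.42, "H1": 0.08, "L2": 0.35, "H2": 0.15}
share_of_non_ra_samples = {"L1": 0.17, "H1": 0.26, "L2": 0.17, "H2": 0.39}

def deviation_from_null(observed):
    """Observed share minus the share expected if a diagnosis group were
    spread over the four AEBSF groups in proportion to group size."""
    return {g: round(observed[g] - group_share_of_all_samples[g], 3)
            for g in groups}

ra_dev = deviation_from_null(share_of_ra_samples)
non_ra_dev = deviation_from_null(share_of_non_ra_samples)
for g in groups:
    print(g, ra_dev[g], non_ra_dev[g])
```

Positive values mark groups where a diagnosis is over-represented relative to random assignment; RA samples come out positive in L1 and L2 and negative in H1 and H2.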
Potential binding surfaces on HSA. The 3-dimensional structure of HSA has three recognized domains (I, II, and III), each with two subdomains (IA, IB, IIA, IIB, IIIA, and IIIB) [54]. There are also nine known binding pockets for long chain fatty acids distributed throughout the three domains. Two binding sites for drugs and drug-like molecules, Sudlow sites I and II, are located in subdomains IIA and IIIA [55,56], respectively (Fig 6B). The HSA structure and the AAmod sites for each of the three clusters were visualized with UCSF Chimera (version 1.15) [57], with C1 sites in blue, C2 sites in red, and C3 sites in green (Fig 6A). AEBSF modification sites in C1 and C3 are mostly in domain II: 70% and 55%, respectively. AEBSF modification sites in C2 are mostly in domain I (50%) (Fig 6C, S5 Data). Sudlow site I (IIA) has the most frequently observed (33%) AEBSF modification sites from all three clusters combined (Fig 6B, S5 Data). The modified amino acids in C1 and C3 are mostly lysine, and mostly tyrosine in C2 (Fig 6D, S5 Data). PyRosetta [58] was used to extract the secondary structure and solvent accessible surface area (SASA) scores from a representative crystal structure of HSA (PDB ID: 1N5U [59]); 81% of the modification sites are on an α-helix, and 19% are on a loop (Fig 6E, S5 Data). The average SASA scores of C1, C2, and C3 are 95.9 ± 37.2, 37.1 ± 32.5, and 81.8 ± 37.3 (Fig 6F, S5 Data), where a larger value indicates more surface accessibility. When each AEBSF modification intensity is compared between RA and non-RA subjects in the C1, C2, and C3 clusters, all appear to be less accessible in RA HSA. Multiple two-sample t-tests (p-value adjusted) comparing the modification intensity between RA and non-RA samples reveal that three C1 sites (Y263, K359, and H367, in subdomain IIB) and two C2 sites (Y401 and Y497, in subdomain IIIB) have p-values below 0.05, indicating potential RA-specific binding sites (Table 1, indicated in Fig 6A).
These significant sites are labeled in Fig 6A. As explained for Fig 5C, we do not expect potential binding partners at these C1 and C2 sites to increase HSA thermal stability because they are associated with more HPR samples. This hypothesis is strengthened by the fact that most C1 and C2 sites (particularly the significant ones) appear on the outer surfaces of HSA (Fig 6A) and on potentially mobile helices (Fig 6E), making them less likely to have a significant impact on overall HSA stability. On the other hand, C3 sites appear to be more concentrated in inner folds of HSA, where a large number of core interactions would need to be broken during denaturation. This, along with the fact that the C3 cluster is associated with more LPR samples, aligns with the hypothesis that decreased surface accessibility at these sites is a marker of RA-specific HSA stabilization and a decreased peak ratio (Fig 5C). Interestingly, no individual C3 sites show statistically significant differences between RA and non-RA groups (Table 1), but as a whole, we see that there is an enrichment in non-RA subjects with high C3 sites (Fig 5C). This suggests that the HSA structure is modified by dynamic surface interactors like other proteins, rather than covalently cross-linked molecules. Covalently cross-linked proteins or molecules would be more likely to show significantly decreased accessibility at exact sites. In contrast, our data (fewer site-specific changes and more area-specific changes) indicate larger, more regional surface interactions.
The most significantly altered C3 amino acid residue (in terms of surface accessibility) is S287. Compared to RA subjects, non-RA subjects had 2.54 times as much AEBSF modification at the S287 site (p = 0.10). In the PDB structure, S287 already appears quite buried in HSA (Fig 6), and its SASA score is only 19.7, the lowest of all C3 sites (Table 1). The next four most altered C3 sites are K12, H288, Y452, and K136 (all near S287); at these sites, non-RA subjects have 1.63, 1.54, 1.52, and 1.48 times as much AEBSF modification (p-values are 0.14, 0.46, 0.16, and 0.17, respectively). Seven of the C3 sites (including these top 5) are in a small, localized area (oval shaped magnification in Fig 6A) in domain I that could be a plausible binding interface with RA-specific interactors. Binding interactions here could increase the thermal stability of HSA and reduce the surface reactivity of these sites.

Conclusions
In agreement with literature on other diseases [30, 32], we found that the HDCs of serum are characteristically shifted (shown by a decreased first/second peak ratio) in all subjects experiencing inflammatory symptoms. Interestingly, RA subjects displayed an even lower peak ratio compared to non-RA subjects, suggesting a more pronounced HDC shift. In comparison, all 15 of the healthy control subjects used by Garbett et al. [22] fall into the HPR group (peak ratio > 1.00, Table 2), whereas 54.5% of non-RA subjects but only 18.5% of RA subjects fell into the HPR group. Our data are consistent with the literature showing that concentrations of the top 8 proteins do not change significantly during RA or other cases of inflammation [60]. Our data support the proposed mechanism in Fig 2B, that an increase in HSA stability (5-15˚C increase in melting temperature for ~10% of HSA) would be a more plausible explanation for the difference between non-RA and RA HDCs. CRP, known to defend against infectious agents and play a significant role in the inflammatory response [4,61], is the only protein with a significantly different concentration in both comparisons. Both groups are expected to have elevated CRP concentrations, but relative concentrations are less elevated in both RA and LPR groups, compared to non-RA and HPR groups. At the same time, we observed that RA subjects are underrepresented in the C1, C2, and C3 clusters, indicating that RA-positive HSA is less accessible compared to that of non-RA subjects, but LPR only appears to be underrepresented within the C3 cluster, indicating that it is at those sites in particular that the RA-specific decrease in HSA accessibility is associated with the characteristic HDC shift.
These findings suggest a model consistent with Fig 2B. CRP, produced predominantly by hepatocytes in response to stimulation by IL-6, is known to be a promiscuous interactor and recruiter of proteins [62,63]. For example, CRP binding to immunoglobulin Fc gamma receptors (FcgR) promotes the production of proinflammatory cytokines leading to inflammation [61]. It is possible that during inflammation in non-RA subjects, high levels of CRP associate with HSA binding proteins, changing the structural stability of this portion of the HSA. Since

Table 1 lists the structural and AEBSF modification information for each of the AAmod sites in clusters C1, C2, and C3 shown in Fig 5B. The AA being modified, and its position in UniProt and PDB 1N5U, is listed. The table also includes the secondary structure (SS) and solvent accessible surface area (SASA) scores for each of the AAmod sites. The p-value column indicates a two-sample t-test comparing RA and non-RA site accessibility, using AEBSF signal. *: indicates the sites with statistically significant fold changes of AEBSF modification between RA and non-RA groups.

PLOS ONE
To better understand RA pathology and changes in HSA stability, future RA studies should look for potential binding partners by extracting lipids and small molecules from purified serum HSA. Other directions for exploring the mechanisms that increase HSA stability are: (1) using more specific surface modifications or chemical crosslinking reagents to carry out in-depth surface probing of HSA and to collect specific information about HSA binding partners and coordination changes [93,94], and (2) comparing HSA and CRP protein binding partners in RA and non-RA patients using immunoaffinity purification together with mass spectrometry to understand how a change in CRP concentration could be contributing to HSA interactor changes. Future research into HSA and other related proteins will continue to enhance our understanding of RA-specific pathology and give insights into the development of, and potential treatments for, RA.

Heat denaturation curves
Blood serum samples (n = 50) were obtained from ARUP Laboratories. Samples were prepared in random order for Nano Differential Scanning Calorimetry (Nano DSC) measurements by first filtering with a 0.45-micron filter. After being degassed, 40 μL of the blood serum was diluted with 960 μL of buffer. The buffer used for dilution was 10 mM phosphate-buffered saline (PBS) (138 mM NaCl, 2.7 mM KCl at pH 7.50). Samples were refrigerated at 4˚C until Nano DSC scans were made. Samples were prepared ten at a time and loaded into the Nano DSC autosampler at 5˚C. After loading and a 600-second equilibration period, samples were scanned from 20˚C to 110˚C at 1˚C/min and corrected against a reference cell. The remainder of the undiluted serum samples was used for MS analysis to look for changes in protein concentration, as well as PTM frequency and location.

Calculating the peak values
Calorimetry results were first corrected for the instrument baseline by subtracting a buffer injection control. Nonzero baselines were then corrected by applying a linear baseline between the minima at 25˚C and 82˚C. Scans were finally normalized for the volume of protein injected (supplemental information). We then took the raw HDC between 25 and 100˚C and rescaled it, setting the minimum of each HDC to 0 and the maximum to 100. This ensured that the peak ratio was calculated from two positive values. The low temperature peak value (HSA peak) was measured at 63˚C, and the high temperature peak value (Ig peak) was measured at 71˚C. The HSA/Ig peak ratio was then calculated.
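The rescaling and ratio steps can be illustrated with a minimal Python sketch. It assumes a curve that has already been baseline-corrected and normalized, stored as two lists of equal length; the function names, the linear interpolation between scan points, and the toy values are illustrative assumptions and not part of the original analysis. The HPR/LPR cutoff of 1.00 follows the group definition used in this study, and a ratio of exactly 1.00 is left unassigned because no rule is stated for it.

```python
def interpolate(temps, values, t):
    """Linearly interpolate `values` at temperature t (temps ascending)."""
    for i in range(len(temps) - 1):
        t0, t1 = temps[i], temps[i + 1]
        if t0 <= t <= t1:
            frac = (t - t0) / (t1 - t0)
            return values[i] + frac * (values[i + 1] - values[i])
    raise ValueError("temperature outside scanned range")


def peak_ratio(temps, heat, hsa_temp=63.0, ig_temp=71.0, lo=25.0, hi=100.0):
    """Rescale the lo-hi window to 0-100 and return the HSA/Ig peak ratio."""
    window = [h for t, h in zip(temps, heat) if lo <= t <= hi]
    h_min, h_max = min(window), max(window)
    if h_max == h_min:
        raise ValueError("flat curve cannot be rescaled")

    def scale(h):
        return 100.0 * (h - h_min) / (h_max - h_min)

    hsa = scale(interpolate(temps, heat, hsa_temp))  # value at 63 C
    ig = scale(interpolate(temps, heat, ig_temp))    # value at 71 C
    if ig == 0:
        raise ValueError("Ig peak value is zero after rescaling")
    return hsa / ig


def hdc_group(ratio):
    """Assign HPR (> 1.00) or LPR (< 1.00); exactly 1.00 is unassigned."""
    if ratio > 1.0:
        return "HPR"
    if ratio < 1.0:
        return "LPR"
    return None


# Toy curve with made-up values (not measured data).
temps = [25.0, 63.0, 71.0, 100.0]
heat = [0.0, 8.0, 10.0, 1.0]
ratio = peak_ratio(temps, heat)
print(ratio, hdc_group(ratio))
```

Because both peak values are rescaled with the same minimum and maximum, the ratio depends only on the shape of the curve within the 25-100˚C window.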

Protein digestion
The serum samples were denatured with 6 M guanidinium chloride (GdmCl) in 100 mM Tris/HCl (pH 8.5) and protease inhibitor (Sigma-Aldrich, cat #: P8340), then spun at 21,000 × g for 20 minutes at 4˚C to remove insoluble cell contents. The supernatant, which contains soluble proteins, was then transferred into new tubes. The BCA assay (Thermo Fisher Scientific cat #: 23227) protocol was followed to measure the protein concentration in each sample. Serum (1.5 μL, containing about 50 μg of protein) was diluted to 50 μL in 1X PBS and combined with 100 μL of 6 M GdmCl. Each sample was transferred into a new tube, then 1.2 μL of 200 mM dithiothreitol (DTT, >99%, Sigma #D-5545) in water was added (final concentration 5 mM) and the mixture was incubated at 55˚C in a sand bath for 15 minutes to reduce disulfide bonds. The mixture was then cooled for 5 minutes. We then added 3.8 μL of freshly made 200 mM iodoacetamide (IAM, 97%, Sigma #I-670-9) in water (final concentration 15 mM) and incubated for 1 hour at room temperature in the dark to alkylate the reduced proteins.
Next, samples were put onto 30 kDa centrifugal filters and spun at 14,000 × g for 10 minutes. Then 100 μL of 6 M GdmCl in 100 mM Tris/HCl (pH 8.5) was added, and the samples were spun at 14,000 × g. This was repeated twice. Then 100 μL of 25 mM ammonium bicarbonate (ABC) was added, and the samples were spun again at 14,000 × g. This was also repeated twice. Next, we emptied the collection tube and cleaned it three times with ddH2O, and 100 μL of 25 mM ABC was added to the top of the filter.
MS trypsin (Promega gold MS sequencing grade Trypsin #V5111) was added to the solution above the filter in a 1:50 (w/w) trypsin/protein ratio, and the samples were incubated at 37˚C overnight on a shaker. After that, each sample was quenched with 300 mM phenylmethylsulfonyl fluoride (PMSF, final concentration 1 mM). Samples were then centrifuged at 14,000 × g for 30 minutes, 100 μL of 25 mM ABC was added, and the samples were centrifuged again at 14,000 × g for 30 minutes. The filtrate was collected in mass spec vials, dried with a Speedvac, and resuspended in 3% acetonitrile (ACN), 0.1% formic acid (FA) to 1 μg/μL.

Mass spectrometry acquisition for proteomics
Data for the 50 samples were acquired in a randomized order. Digested peptides were separated on a Polaris-HR-C18 HPLC chip in a chip cube nano spray source using an Agilent 1260 HPLC, followed by positive ESI and mass detection using an Agilent QTOF mass spectrometer (6530B). The mobile phases consisted of MS grade 3% acetonitrile, 0.1% formic acid for Buffer A and 97% acetonitrile, 0.1% formic acid for Buffer B. A 50-minute gradient was run at a flow rate of 0.3 μL/min: 0%-5% Buffer B (0-0.5 min), 5%-30% Buffer B (0.5-27 min), 30%-95% Buffer B (27-28 min), 95% Buffer B (28-31 min), 95%-5% Buffer B (31-33 min), 5%-95% Buffer B (33-35 min), 95%-0% Buffer B (35-46 min), 0% Buffer B (36-49 min). Auto MS/MS fragmentation was performed using a variable collision energy determined by ion mass, over 290-1700 m/z at a rate of 4 spectra/s (250 ms/spectrum) and with an isolation width of around 4 m/z. The auto MS/MS method selected precursor ions that were above 2500 counts and had a charge state of 2 or above for fragmentation. The MS/MS scan range was 100-1700 m/z, and a maximum of 10 precursors was allowed per cycle. Previously selected precursors were excluded from MS/MS selection for 0.2 min. This prevented continual acquisition of the same m/z and allowed other, less abundant species to be acquired by the mass spectrometer.
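The precursor selection rules described above can be sketched in Python for illustration. This is not the instrument vendor's implementation: the function name, the dictionary layout, ranking candidates by abundance, and matching excluded ions by m/z rounded to two decimals are all assumptions made for the example. Only the thresholds (more than 2500 counts, charge state of 2 or above, 290-1700 m/z, up to 10 per cycle, 0.2 min exclusion) come from the method description.

```python
def select_precursors(candidates, last_selected, now_min,
                      min_counts=2500, min_charge=2, max_per_cycle=10,
                      exclusion_min=0.2, mz_range=(290.0, 1700.0)):
    """Pick precursors for one MS/MS cycle.

    candidates: list of dicts with "mz", "counts", and "charge" keys.
    last_selected: dict mapping rounded m/z to the time (min) it was last
        fragmented; updated in place.
    """
    eligible = []
    for ion in candidates:
        key = round(ion["mz"], 2)  # assumed m/z matching rule
        if not (mz_range[0] <= ion["mz"] <= mz_range[1]):
            continue
        if ion["counts"] <= min_counts or ion["charge"] < min_charge:
            continue
        picked_at = last_selected.get(key)
        if picked_at is not None and now_min - picked_at < exclusion_min:
            continue  # still inside the exclusion window
        eligible.append(ion)

    # Assumption: the most abundant eligible ions are fragmented first.
    eligible.sort(key=lambda ion: ion["counts"], reverse=True)
    chosen = eligible[:max_per_cycle]
    for ion in chosen:
        last_selected[round(ion["mz"], 2)] = now_min
    return chosen


# Made-up survey scan, for illustration only.
scan = [
    {"mz": 500.25, "counts": 9000, "charge": 2},
    {"mz": 600.30, "counts": 2000, "charge": 2},  # below count threshold
    {"mz": 700.40, "counts": 5000, "charge": 1},  # charge state too low
    {"mz": 800.50, "counts": 4000, "charge": 3},
]
history = {}
print([ion["mz"] for ion in select_precursors(scan, history, now_min=10.0)])
```

Calling the function again with the same scan within 0.2 min of the first call returns no precursors, which is the behavior that frees acquisition time for less abundant species.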

Protein identification and quantification
Protein identification and quantification were performed with two programs. The first was Protein Prospector, developed in the University of California San Francisco Mass Spectrometry Facility and funded by the NIH National Institute for General Medical Sciences. The second was PEAKS Studio 8.5, developed by Bioinformatics Solutions Inc. Both programs compared peptide fragmentation against the SwissProt human database downloaded in August 2017 with the following parameters: monoisotopic for precursor mass search type; semispecific for digest mode, with 3 missed cleavages allowed; 20 ppm for parent mass error tolerance; 0.5 Da for fragment mass error tolerance; a maximum of 3 variable PTMs per peptide, with carbamidomethylation as a fixed modification and oxidation, pyro-glu from Q, and 9 other customized PTMs as variable modifications (listed in detail in S4 Data) in the PEAKS DataBase step; 311 built-in PTMs in the PEAKS PTM step; and a 20 ppm mass error tolerance and 3 min retention time shift tolerance in the label-free quantification step. The raw data are available for download at chorusproject.org (project ID: 1739, experiment ID: 3632).
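To make the two tolerance types concrete, the following Python sketch shows how a relative (ppm) parent mass tolerance and an absolute (Da) fragment mass tolerance are typically evaluated. The function names and example masses are hypothetical; the search engines named above apply these checks internally, and this sketch does not reproduce their code.

```python
def ppm_error(observed, theoretical):
    """Relative mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def parent_within_tolerance(observed, theoretical, tol_ppm=20.0):
    """Parent mass check: relative tolerance (20 ppm in this study)."""
    return abs(ppm_error(observed, theoretical)) <= tol_ppm


def fragment_within_tolerance(observed, theoretical, tol_da=0.5):
    """Fragment mass check: absolute tolerance (0.5 Da in this study)."""
    return abs(observed - theoretical) <= tol_da


# Made-up masses, for illustration only.
print(parent_within_tolerance(1000.01, 1000.0))   # about 10 ppm off
print(parent_within_tolerance(1000.03, 1000.0))   # about 30 ppm off
print(fragment_within_tolerance(500.4, 500.0))    # 0.4 Da off
```

A ppm tolerance scales with the mass being matched, so 20 ppm corresponds to 0.02 Da at 1000 Da, whereas the 0.5 Da fragment tolerance is the same at every mass.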

Protein structure analysis
Some analyses were performed with UCSF Chimera, developed by the Resource for Biocomputing, Visualization, and Informatics at the University of California, San Francisco, with support from NIH P41-GM103311.

Inferno hierarchical clustering and heatmap analysis
The heatmap for AEBSF on HSA was created using InfernoRDN, developed by Pacific Northwest National Laboratory (PNNL, [53]). PTM sites were identified and quantified using PEAKS Studio. PTM sites that were present in at least 12 samples were included in heatmap generation. Files were then loaded into InfernoRDN and Log2 transformed to reduce the influence of outliers in later analysis. A dual-clustered heatmap was generated with the standard Euclidean modeling parameters. The hierarchical order output was then used to determine the most changed PTM sites between samples and the subsequent PTM site groupings. Additionally, the dual-clustering setting allowed groups to be observed across samples, which were statistically examined for correlation with RA diagnosis.
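The filtering and transformation steps that precede clustering can be sketched in Python as follows. This is an illustration only and does not reproduce InfernoRDN: the data layout, the function names, the treatment of missing values as None, and the choice to compute the Euclidean distance over samples where both sites have a value are all assumptions. The site labels and areas are made up, and the toy call lowers the presence threshold from 12 to 2 so that it fits a three-sample example.

```python
import math


def filter_and_log2(site_areas, min_samples=12):
    """Keep sites quantified in at least min_samples samples, then log2.

    site_areas: dict mapping a PTM site label to a list with one peak
        area per sample (None where the site was not quantified).
    """
    result = {}
    for site, areas in site_areas.items():
        n_present = sum(1 for a in areas if a is not None and a > 0)
        if n_present >= min_samples:
            result[site] = [
                math.log2(a) if a is not None and a > 0 else None
                for a in areas
            ]
    return result


def euclidean(u, v):
    """Euclidean distance over samples where both profiles have values."""
    pairs = [(a, b) for a, b in zip(u, v) if a is not None and b is not None]
    if not pairs:
        raise ValueError("profiles share no quantified samples")
    return math.sqrt(sum((a - b) ** 2 for a, b in pairs))


# Made-up areas for three hypothetical sites across three samples.
toy = {
    "site_A": [4.0, 8.0, None],
    "site_B": [2.0, None, None],  # present in only one sample
    "site_C": [16.0, 4.0, 1.0],
}
kept = filter_and_log2(toy, min_samples=2)
print(sorted(kept))
print(euclidean(kept["site_A"], kept["site_C"]))
```

The resulting pairwise distances are the kind of input a hierarchical clustering routine uses to order sites and samples in a dual-clustered heatmap.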
• Sample Information: Includes sample name, age, gender, CCP/RF values, and clinical RA diagnosis. This was provided by ARUP Laboratories.
• Classification in each experiment:
  ◦ HDC Peak Ratio: The ratio of the HSA peak at 63˚C to the Ig peak at 71˚C
  ◦ HDC group: the group is assigned based on HDC peak ratio. The HPR has peak ratio > 1.00, the LPR has peak ratio < 1.00
  ◦ PTM group: the group is assigned based on hierarchical order of the Inferno Heatmap in
• protein-peptide: the peptide area exported from LFQ from PEAKS Studio for the 49 samples. This is used for protein quantification and PTM analysis.